Sequence-to-sequence models provide a simple and elegant solution for building speech recognition systems by folding separate components of a typical system, namely acoustic (AM), pronunciation (PM) and language (LM) models, into a single neural network. In this work, we look at one such sequence-to-sequence model, namely listen, attend and spell (LAS), and explore the possibility of training a single model to serve different English dialects, which simplifies the process of training multi-dialect systems without the need for separate AM, PM and LMs for each dialect. We show that simply pooling the data from all dialects into one LAS model falls behind the performance of a model fine-tuned on each dialect. We then look at incorporating dialect-specific information into the model, both by modifying the training targets by inserting the dialect symbol at the end of the original grapheme sequence and also by feeding a 1-hot representation of the dialect information into all layers of the model. Experimental results on seven English dialects show that our proposed system is effective in modeling dialect variations within a single LAS model, outperforming a LAS model trained individually on each of the seven dialects by 3.1 ~ 16.5% relative.
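The two conditioning ideas described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the dialect codes, function names, and feature dimensions are all assumptions made for the example.

```python
import numpy as np

# Illustrative dialect codes; the paper uses seven English dialects but does
# not fix these identifiers.
DIALECTS = ["us", "gb", "in", "au", "ca", "nz", "za"]

def make_targets(graphemes, dialect):
    """Append a dialect symbol to the end of the grapheme target sequence."""
    return list(graphemes) + [f"<{dialect}>"]

def one_hot(dialect):
    """Encode the dialect as a 1-hot vector."""
    v = np.zeros(len(DIALECTS), dtype=np.float32)
    v[DIALECTS.index(dialect)] = 1.0
    return v

def condition_features(layer_input, dialect):
    """Concatenate the 1-hot dialect vector onto every frame of a layer's
    input, one simple way of feeding dialect information into all layers."""
    num_frames = layer_input.shape[0]
    dialect_block = np.tile(one_hot(dialect), (num_frames, 1))
    return np.concatenate([layer_input, dialect_block], axis=-1)
```

For example, `make_targets("cat", "gb")` yields `["c", "a", "t", "<gb>"]`, and conditioning a `(frames, 8)` feature matrix grows it to `(frames, 15)` here, since the 1-hot vector has one entry per dialect.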